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ABSTRACT 



A multiple frequency quaternary phase shift keyed signal 
is generated using a complex Fast Fourier Transform on an 
industry standard personal computer and is output using direct 
memory access through a digital to analog converter. The 
output is permitted at rates of up to the maximum direct 
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output are compared as alternate solutions to the problem of 
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I . INTRODUCTION 



A. BACKGROUND 

Over the past several years a multiple frequency quater- 
nary phase shift keyed (MFQPSK) communications system has been 
developed at the Naval Postgraduate School (Proctor , 1985) 
(Gray, 1986) (Whitacre, 1986) . Several implementations have 
been proposed which require specialized hardware construction 
with no flexible means of actually generating the signal 
instead requiring the signal to be generated offline. A 
simpler implementation housed in the industry standard 
computer was desired. The benefits of using a general purpose 
personal computer as the driver were anticipated to be that 
the transfer of data would be simplified if the output device 
was physically connected to the generation device and that the 
technique could be simply replicated by others desiring to 
generate such signals. 

B . MFQPSK SIGNAL 

The MFQPSK encoding is similar to that currently used by 
1200 baud modems using the Bell 212A Standard differential 
phase shift keying (DPSK) in that data is represented by one 
of four phase shifted versions of a single frequency (tone) 
in each baud of a transmission. A baud is a discrete grouping 
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of information, in this case, capable of representing two 
binary bits of information. 

A four baud sequence of the four possible phase choices 
on a single tone is demonstrated in Figure 1 where the 
relative phase selections are +0 degrees, +180 degrees, +270 
degrees, and +90 degrees. Note that the shift is referenced 
to the previous phase position, not the initial position of 
a +45 degree phase shifted sinusoid. 




Figure 1. Four Baud Sequence 



The modem technology does not directly shift the tone's 
phase but rather uses the algebraic addition and subtraction 
of a sine wave and cosine wave, the quadrature or orthogonal 
components, of the tone to modulate this phase information as 
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is shown in Figure 2 which represents the third baud of the 
above sequence — a total of +135 degrees shift, or a tone 
shifted to the second quadrant. 




While this method is employed reasonably well with one 
discrete frequency (or two for the case of a full duplex 
modem) it is not very practical for multiple frequencies. 

A method of modulating frequencies with phase information, 
which is suited better to many frequencies, is the frequency 
spectrum to time signal transformation of the Fast Fourier 
Transform (FFT) . 

Figure 3 demonstrates this principle for a single fre- 
quency and a phase in the first quadrant (a phase shift of +90 
degrees) of a 16 frequency bin system and introduces a new 



3 




graphical convention where the frequency spectrum is displayed 
showing both phase and magnitude on a single graph. Each 
frequency location is a polar plot with the line length 
representing the magnitude of a frequency and the angular 
displacement representing the phase shift. A vertical line 
would indicate a positive purely imaginary frequency. The 
FFT method is the approach taken to generate the MFQPSK 
signal. The phases are specified in their appropriate 
frequency bins which, in turn, predetermines, for real 
signals, the phases for the bins symmetrically located about 
the half sampling (Nyquist) frequency bin of the transform. 
The spacing of these frequency bins (or frequency resolution) 
is determined by dividing the Nyquist frequency by the number 
of bins on each side. Sampling time is the direct inverse of 
this frequency bin size. An example that encodes information 
in four frequencies per baud in a 16 frequency bin system is 
shown in Figure 4. If the resulting transformed signal has 
the sampled-and-held signal updated with a new value at a 
frequency of, for example, 60 Hz, the Nyquist frequency is 60 
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Hz, the top frequency bin represents 120 Hz and each of the 
frequency bins have a size of 7.5 Hz. 




Following the Inverse Discrete Fourier Transformation of 
such a signal, the completely real time domain signal, which 
consists of a series of 16 discrete steps made continuous by 
holding the sample until the next discrete step is due, is 
transmitted carrying phase information in each of the discrete 
frequencies simultaneously. The overall signal closely 
resembles bandlimited white noise. Figure 5 shows the time 
domain signal of a typical 256 tone, 4096 bin packet or baud. 

C. OVERVIEW 

The purpose of this study was to design, optimize, and 
construct a prototype output expansion board for the personal 
computer bus. The sister design problem of constructing an 
expansion board for the received signal is the subject of a 
concurrent thesis study, although to a large extent the 
problem is greatly simplified by choices made on the output 
problem. Chapter Two addresses the communication system as 
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Byle Number 

Figure 5. Time Domain Signal of a 256 Frequency 
Signal in a 4096 Bin System. 



a whole and discusses the required hardware and software 
elements to make the system viable. The many elements of 
interaction and interdependence are discussed to enable the 
reader to develop a base of understanding for the choices made 
in the following chapters. Implementation options are 
discussed along with details of the experimental process in 
Chapter Three. The conclusions and recommendations for 
further study are the subjects of Chapter Four. 
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II. PC BASED MFQPSK COMMUKICATIONS 



A. SYSTEM COMPONENTS 

The major components in a personal computer based MFQPSK 
communications system are: (1) data input, (2) Fast Fourier 
Transform mechanism, (3) memory access system, and (4) digital 
to analog conversion. 

The organization of these elements is coordinated by a 
main high level language program which calls subroutines to 
initiate the individual operations required. 

B. DATA INPUT 

The main program's first task is to retrieve the message 
character set and perform the frequency encoding. A 
subroutine would typically perform this task. In the final 
system implementation, it is supposed that the entire message 
to be encoded is readily available for translation. Messages 
are prepared independently of the communications system, but 
quite possibly with an editor on the same physical computer, 
with characters being represented in American Standard Code 
for Information Interchange (ASCII) format and placed together 
as a standard Microsoft Disk Operating System (MS-DOS) file 
on one of the computer's disk drives. 
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The encoding process must take care of several details to 
be used effectively: (1) It must confirm an appropriately- 
sized message, for example 4096 characters, fill the unused 
character area with a predetermined idle character if too 
short, or break the message into packets, each of which can 
be contained within the size constraints provided, if too 
long. If the message is broken into parts, the main program 
must have a means of automatically transmitting multiple 
segments. (2) The primary memory needed for the FFT subroutine 
parameters and scratch area must be dynamically allocated 
prior to the subroutine call and deallocated following. The 
exception to the deallocation is the case where several 
packets are being prepared for transmission as a single 
message. Allocated areas represented by a pointing vector 
would be maintained in a list of areas in main memory to be 
output. When the transmission started, each area would be 
transmitted in turn, with pointers representing that area 
being removed from the message's list of pointers and sent to 
the transmitting routine. (3) Appropriate phase and frequency 
information must be assigned based on the message given, 
retained and assigned the complex conjugate values in the 
corresponding symmetrically located frequency bins. (4) 
Finally, the encoding subroutine must initiate the FFT 
subroutine call if the FFT is to be completed in software or 
must pass the appropriate primary memory location to the main 
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program if a hardware FFT is to be accomplished. If a 
multiple packet message was to be sent an FFT would be 
conducted repetitively by moving through the message list of 
pointers . 

C. FAST FOURIER TRANSFORM 

Once the process has the data encoded into frequency and 
phase information, the FFT must be performed. As mentioned 
in the previous section, this can be accomplished either in 
hardware or software. 

In the hardware choice, the data must be passed from the 
computer primary memory to the expansion board where a 
provision will have to be made at this point for an off 
computer secondary storage method. A hardware FFT, to utilize 
its speed advantage over a software FFT, must position the 
entire array of data elements in such a way as to allow all 
data to enter the parallel FFT process simultaneously and then 
be able to output the data in a byte serial fashion for final 
digital to analog conversion. 

In the software FFT case, the main program must now call 
the FFT routine. FFT routines typically are passed an address 
pointer to the data location and return the transformed data 
in the same primary memory space. 
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D. MEMORY ACCESS 



Regardless of whether the data is transferred from the 
primary memory to the expansion board before or after the FFT 
is performed, a decision about the means by which it is 
transferred must be made, unless the computer's memory is to 
be paralleled with memory placed on the expansion board or the 
hardware device is to have some bus authority, possibly as a 
parallel processor. Paralleling memory has hazards which will 
be discussed a little in the next chapter on choices. In 
general, however, there are three reasonably routine methods 
of moving the data off computer: (1) high-level language 
transfer; (2) assembly-language transfer; and (3) direct 
memory access (DMA) . Each of these choices lead to different 
paths, each requiring further choices to be made. One major 
factor in the decision, common to all choices, is the com- 
puter's own housekeeping problems, specifically, the memory 
refresh requirement and the nature and means of executing 
instructions . 

E. HOUSEKEEPING 

1 . MEMORY REFRESH 

Computers use dynamic random access memory (DRAM) for 
their main memory. DRAM consists of a bank of single 
transistor elements which are conveniently modelled by a bank 
of capacitors having either a charged or depleted state 
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representing a logic one or zero respectively. As with all 
capacitors, the charge will tend to leak over time, with the 
less desirable devices having less internal capacitance, and 
therefore less ability to hold their charge for a specified 
time. These less desirable devices require recharge or 
refresh more often than the more desirable ones. A typical 
engineering judgement made by the memory device manufacturer 
sets the time between refresh cycles at 15.125 microseconds. 

This refresh is carried out automatically by a single 
DMA channel. Since the priority of the refresh is higher than 
any other computer operation, including that of the main 
program execution, allowances or corrections must be made to 
prevent a negative impact on the program's performance. 

2. INSTRUCTION FETCHING AND EXECUTION 

Instructions are placed in a section of main memory 
along with the operands they may require for execution. The 
instructions have varying lengths and are fetched by the main 
processor one byte at a time. The rate for this instruction 
fetch is one byte every four clock cycles because it is 
basically a memory fetch. Therefore, the instructions which 
consist of several bytes may take many clock cycles to load 
into the processor before they can be executed. 

One of the innovations of the Intel 8088 over earlier 
processors was the implementation of an advanced fetching 
mechanism where instructions can be fetched in advance of 
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their being needed by the processor. A total of four instruc- 
tions may be prepostioned like this. One difficulty in 
placing the instructions in the processor in advance is that 
if a program loop occurs or program control is passed to 
another location by a subroutine call, the processor is ready 
to execute instructions in the old path. In these cases, the 
instructions are flushed from the processor and the process 
is restarted. This time consuming process can actually slow 
down the execution of a program directly following a branch 
or loop. 

Instructions are programs contained within the 
microprocessor. These little programs are called microcode. 
Execution times of these microcoded programs depend on the 
number of operations to be performed. Some instructions are 
fairly straightfoirward and consume few machine cycles. 
Examples of short instructions are register to register 
transfers or arithmetic rotates which typically require a 
single arithmetic logic unit cycle. The two instructions 
cited take only two clock cycles of the main processor to 
accomplish. Others such as the interrupt instruction are 
extremely complicated and perform many functions inside the 
processor. It manipulates the stack pointer six separate 
times, as well as manipulating a number of other registers and 
takes over 50 clock cycles to perform. 
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Regardless of the instruction or its length, the 
processor will complete the instruction in its entirety before 
passing bus control to another process. 

3 . HARDWARE INTERRUPTS 

In addition to memory refresh, processor program 
control may be set aside by the occurrence of hardware inter- 
rupts. The three interrupts used by the computer are: 
Interrupt 0 which is invoked every time the Intel 8253 
programmable interval timer count reaches zero. Interrupt 1 
which is invoked every time a key is depressed on the system 
keyboard, and Interrupt 6 which reports a disk access 
complete. 

These interrupts are maskable and can be disabled by 
selecting the appropriate mask at Port 21h (address of the 
interrupt controller) . 

4 . SYSTEM IDLE 

Any program controlled machine must either be held in 
a halted state or it must have a series of instructions to 
perform. The halted state is used whenever another processor 
has control of the bus as in the case of a DMA transfer. More 
often than not, however, some input is required from the 
keyboard for the operating system to perform its next func- 
tion. The input required may be a data input for a program, 
or may be a command for the operating system itself invoking 
a program. 



13 



The program loop used by the system while waiting for 
input contains the two instructions at each end of the 
instruction execution time spectrum — a two clock cycle 
instruction and a 52 clock cycle instruction. This program 
loop keeps checking to see if any entry has been made to the 
keyboard. Since the time to type a key is so large compared 
to the time it takes to perform instructions, the keyboard 
interrupts the process and deposits the representation of the 
key in a buffer in memory and, if the key is the first to be 
placed in an empty buffer, sets a flag to indicate something 
in the buffer. 

The read-only memory (ROM) basic input/output system 

(BIOS) contains the instructions which tell the computer how 

to perform this vital service of waiting for an input. The 

ROM-BIOS is proprietary and changes from machine to machine 

but in all machines the program loop looks something like the 

following (the numbers in parenthesis indicate the number of 

clock cycles required for the instruction to execute) : 

mov ah,l ; request service #1 (4) 

recheck: int 16h ;anything from keybd?(51) 

jz recheck ;zero flag means no (16) 
mov ah,0 ; request service #0 (4) 

int 16h ;places char in ah,al(51) 
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Int 16h looks something like: 

cmp flag,l ;true flag means keybd(9) 

; hardware interrupt # 1 
;has occurred — its 
; int service routine sets 
;the flag 

iret (32) 

F. DIGITAL TO ANALOG CONVERSION 

All implementations of this system perform the digital to 
analog (D/A) conversion on the expansion board off computer. 
Since the conversion is made continuously, some decision must 
be made regarding the method of latching the data and 
selecting the specific time to activate the latch. Once data 
is latched the conversion is made to an analog signal. The 
design must specify the form of the analog signal to be used. 
The three decisions required are: (1) the no signal voltage 

level; (2) the voltage maximum to minimum range; and (3) the 
level of accuracy chosen. One of the given parameters in this 
study was that an eight bit conversion was sufficiently 
accurate. This was a convenient choice since memory data is 
eight bits wide and the INTEL 8088 transfers eight bits at a 
time . 



15 



III. IMPLEMENTATION OPTIONS 



A. DATA INPUT 

Since the encoding process is currently unresolved and the 
subject of further research in another thesis project, a test 
harness called ENCODEDATA was written which allowed the 
selection of the signals for a test of the remainder of the 
system. ENCODEDATA is the first subroutine called in the main 
program which is located in Appendix A. The subroutine 
prompts the user to choose the phase of the selected frequency 
bin or makes provision for a random assignment of frequency 
phase information for all assigned frequency bins. It 
maintains a record of the choices made so that the phases of 
the symmetrically located frequencies are properly chosen to 
maintain a real signal output following the FFT. 

B. FFT IMPLEMENTATION 

One of the first choices made in the design process was 
whether to implement the FFT in hardware or software, because 
of the number of other system characteristics resting on the 
decision. Arguably there exists a third option, that the FFT 
could be calculated off the computer with the results for 
transmission submitted as a file. If such a choice were made, 
however, the main reason for having a personal computer-based 
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system, the ability to have a completely integrated system, 
would be violated. 

The concept design provided at the start of this project 
assumed that the transformation of the data to be transmitted 
would be received and encoded by an offline FFT being provided 
to the computer simply as a file of bytes to be transmitted. 
At the time this author joined the research project, the FFT 
was being accomplished in the A Programming Language (APL) 
language off-line with the resulting data file being 
transferred to the output device which was a simple barrel 
shift register feeding a D/A converter. 

The decision was made to incorporate the FFT into the 
computer system to allow a completely integrated system. A 
hardware FFT implementation was rejected because of the 
current lack of a single chip 4096 point FFT device. A few 
notes on the current work in this area are included in the 
conclusions and recommendations area. 

Following the decision to use a software FFT implementa- 
tion, the algorithm and computer language needed to be chosen. 
As was mentioned earlier, the preliminary research used APL 
which is an interpretive rather than a compiled language. The 
author chose to use a much faster compiled language such as 
Pascal, C or Fortran for the final implementation. 

One benefit of using a compiled program in place of an 
interpreted program is that, for an interpreted language, much 
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more memory is required to be in use to support the program. 
Interpreted languages must keep every command structure 
available for use. A compiled program typically picks only 
those function categories required for the specific applica- 
tion to be loaded into memory for the program to execute 
successfully. For example, if screen output is not required, 
all routines concerning screen interface may be omitted from 
the compiled version of the program with no ill effect. 

The biggest effect, though, is speed. The author found 
a standard Turbo Pascal FFT routine would run at about one- 
half of the time cost of using APL.^ 



The 


decision was made to use 


an 


FFT routine from a 


standard 


library. 


The FFT subroutine 


used 


for the 


test 


harness 


is called 


ComplexPFT and 


is 


part 


of the 


file 



PFT87B2.INC available as part of the Borland Turbo Pascal 
Numerical Methods Toolbox.^ The Cooley-Tukey Algorithm is 
used in this implementation and requires approximately 22 
seconds for a 4096 point FFT on an 8.0 MHz zero wait state 
machine with an Intel 80287 math coprocessor. The FFT is 
performed prior to the beginning of the message transmission. 



^The actual times were 22 seconds for the Turbo Pascal versus 
57 for the APL using the same machine with the most efficient 
algorithms known. 

^Source code for this subroutine is not included because the 
license will not permit reprinting the source code. A program 
written by the license holder is permitted to include a compiled 
copy of the subroutine. 
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As with many FFT routines, the transformed signal is returned 
to the calling program in the same matrix (this also means the 
same primary memory locations) in which the input frequency 
and phase information were passed. 

Flexibility was the factor which the author used to decide 
on the specific compiled language with all other elements 
being equal. FFT subroutines using the same basic algorithm 
were readily available in Fortran and Pascal, and could have 
easily been translated into C. Pascal was chosen over Fortran 
because of the ease in handling input and output and because 
of the fact that, since its entry (particularly the entry of 
TURBO PASCAL by Borland International, Inc. which dramatically 
changed the pricing structure for programming languages) into 
the personal computer arena, a wealth of standard routines 
have become available. All software needed for the develop- 
ment of this project had a cumulative cost of approximately 
$200 at commercial retail rates. ^ 



^This cost included the Turbo Pascal Numerical Methods 
Toolbox, of which the FFT and related routines are a part, the 
Turbo Assembler and the Turbo Debugger, which were used for the 
Assembly language subroutines for initialization and startup of the 
DMA process, and Turbo Pascal, which was used for the main program 
home for the system. The cost of the operating system (approxi- 
mately $100) is not included in these estimates. 
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C. DATA OUTPUT TECHNIQUES 

Given that the FFT was to be performed in software, the 
next step in the process was to decide on the method of 
getting the FFT results from internal memory to the D/A 
converter. The largest problem initially considered in this 
area was that of accommodating the memory refresh overhead for 
the primary memory. Several approaches were examined: (1) 
postpone the refresh such that the entire block of data (4096 
bytes) could be transmitted in between refreshes; (2) replace 
the dynamic memory with static memory which requires no 
refreshing; (3) use off computer static or bubble memory, 
loading the memory off computer in parallel with the on 
computer memory, and then isolating the computer bus during 
the transmission of the output signal; (4) use off computer 
static or bubble memory, transferring data to the off computer 
memory and then transmitting; and (5) use DMA in the single 
byte mode so that memory refresh could be accomplished in a 
normal manner. 

The options were considered in much the order listed 
above. The following sections detail the process through 
which the decision was made. 

1. POSTPONING MEMORY REFRESH 

As mentioned in Section II. E, the frequency of this 
refresh depends on the capacitance of the poorest quality 
device in the system's memory. The computer uses DMA channel 
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zero to perform the refresh, one block at a time, initiated 
by a pulse from the system timer (INTEL 8253) . Each DMA 
channel has a priority assigned based on the number of the 
channel, with the lowest number, channel zero, having the 
highest priority. Refresh excludes the processor from the 
system bus for the period of refresh. The DMA controller 
keeps track of the address of the memory block to be refreshed 
next so that a new block is chosen each cycle. The cycle 
takes five computer clock cycles to complete. 

The refresh time was compared with the most 
straightforward means of outputting data from the primary 
memory — an assembly language program loop. Appendix B 
contains a simple nine line loop program (Eggebrecht, 1983, 
p. 188) which was used in this test. A total of 4 3 clock 
cycles are required in the case of an INTEL 8088 processor to 
accomplish one pass through the loop's instructions. A test 
circuit was designed and used to verify the output timing. 
This operational test of the first circuit design also 
provided the first look at the effect of the memory refresh 
on a controlled output. The test waveform was extremely 
j ittery . 

The reason for the jitter was that while the loop 
consisted of 43 clock cycles, the main processor turned 
control of the busses over to the DMA channel zero memory 
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refresh every 121 cycles for a period of five clock cycles.^ 
The net result of this periodicity conflict was that sometimes 
two data bytes and other times three data bytes would be 
output between refreshes. 

The first attempt to correct the jitter resulting from 
the memory refresh problem was to manage its timing to suit 
the system needs. There is a wide range of timings permitted 
for the memory refresh. 

As mentioned previously, the memory manufacturers 
specify a typical refresh requirement of 15.125 microseconds. 
This is not a hard and fast rule, however. A recent article 
in the PC Magazine (Roemmele, 1988, pp. 331-346) discussed a 
method of manipulating the memory refresh period. Experimen- 
tation on Naval Postgraduate School machines yielded an 
experimental result that the refresh cycle could be extended 
to 30 milliseconds for all machines with some select machines 
being extended up to 0.3 seconds without parity error. The 
PC Magazine article cited an example of a machine which could 
be refreshed as infrequently as 1.001 seconds! The assembly 
language program cited in the article and used in this test 
is included in Appendix C. 



^The times listed here are for the 8.0 MHz personal computer 
using an Intel 80286 processor used in this project. Refresh is 
based on time, not instruction cycles, whereas instructions are 
just the opposite. 
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since the refresh period could not be expanded to 
include an entire block transmission in this mode, the refresh 
period was reduced to match one byte of data transfer. Every 
cycle of the program loop which output a single byte was 
followed by a memory refresh. 

The real difficulty of using the program loop was its 
severe inflexibility. The entire communication system must 
be designed for a specific class of machines running at a 
specific frequency. If for some reason the output frequency 
had to be changed, the refresh period would have to be changed 
and the program loop would also have to be modified to include 
more or less no operation (NOP) statements. The range of 
output frequencies available under this method for an 8.0 MHz 
machine was limited to a low of approximately 66 KHz because 
of the memory refresh requirement and a high of approximately 
14 5 KHz because of the time in executing statements in the 
program loop. 

The first design, although technically meeting the 
requirements of the system, clearly needed more flexibility. 

2. PRIMARY MEMORY REPLACEMENT 

If static ram (SRAM) were used for primary memory, 
memory refresh would not be an issue. SRAM is a much more 
complicated structure which uses, instead of a single transis- 
tor holding a charge, eight transistors forming a flip-flop. 
This typically requires much more chip real estate and, as a 
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result, costs much more per byte than the DRAM. The obvious 
next question was, "Why not just replace the DRAM with SRAM 
and not worry about the refresh?” Unfortunately, the two 
different types of chips are not interchangeable. A SRAM chip 
manufactured with the same technology as a DRAM chip will have 
less memory in the same package size. SRAM built under newer 
technology allows more compact placement of the circuitry and 
has a difference in the pinout which prevents a direct 
replacement. One option in dealing with this problem was to 
build a memory expansion board with SRAM and disable the old 
memory and refresh buses. This option was not attempted 
because of the drastic nature of the solution. It should be 
noted, however, that the frequency range under this option 
would lose its lower end limitation. 

3. ADDING EXTERNAL COMPUTER MEMORY 

Many times a simple conceptual design becomes greatly 
burdened when it gets to the point of real world implementa- 
tion. The idea of paralleling memory is one of those things 
that sounds like a simple solution to the problem of trans- 
ferring data but ends up being very complicated. The concept 
is that corresponding memories on the buses would be paral- 
lelled for the calculation part of the program and then 
isolated by means of a latched address decode following the 
FFT calculation. The expansion address and data buses would 
have to be isolable from those of the main processor. To meet 
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the requirement that memory refresh would not affect the 
output, an entire second system of output address program 
control would have to be implemented to select the particular 
byte to be output at any given time during the second phase 
of the process where the busses were isolated. A 4096 byte 
barrel shifter was found® which could handle the output 
problem of the second phase reasonably well without a separate 
program counter, but had no means for being randomly addressed 
by the processor during the first phase. Many suitable random 
access devices were found which could meet the first phase 
requirements but had no reasonable way of producing the output 
independent of the processor's program control. 

In addition to the complicated construction required, 
flexibility is again of concern. Only one memory region is 
allowed for the placement of data coming from the FFT. Any 
other program using the computer has to be concerned about the 
meaning of the latched decode address for isolating the 
expansion board and its conceivable initiation of the output 
process . 

The research path that this series of problems seemed 
to propose next was to duplicate the memory in question 
without parallelling. This would allow the main processor to 
execute program control for loading the external barrel 

®The IDT7M204 chip was manufactured starting in 1985 by the 
Integrated Device Technology Company, Incorporated. It uses a nine 
bit wide data first in first out (FIFO) array of 4096 bytes. 
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shifter and then pass control for the output directly to an 
oscillator which would gate the output on the expansion board. 
The advantages of this modification were to permit the 
computer housekeeping to proceed without any interference and 
to allow the output process to operate with only the 
limitations of the output device itself. As an example, the 
IDT7M204 device would allow a range of output frequencies 
between 0.1 Hz to 12.0 MHz. 

The disadvantage was that the design still required 
a means of transferring the data from internal to external 
memory and imposed a limitation on the length of a continuous- 
ly transmitted message to the length of the available external 
buffer — in this case only one block or packet of data. 

Still, this approach was the solution pretty well 
settled upon until the problem of memory to memory transfer 
of data was taken on in earnest. It seemed reasonable that 
DMA was in order because raw speed was the prime concern in 
getting the data to the output device. However, once the 
study of the various DMA options was started, it became 
apparent that a DMA technique might well hold the answer for 
a direct output to the converter without the bother and 
inflexibility of an intermediate memory device. 

Additionally, if several packets were to be transmitted 
in a single message, the transmission would have to stop while 
the intermediate device was reloaded from main memory with the 
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next packet of information. The DMA technique would require 
only changing the base pointer in the DMA controller to effect 
the location of the subsequent data for output. 

4 . DMA OUTPUT 

The DMA controller (Intel 8237) has two general modes 
of operation — block and byte. When performing a transfer in 
the block mode, no maskable interrupts or other coprocesses 
are allowed to take control of the bus. This can result in 
a failure of the refresh if the blocks are too large. The 
computer manufacturers specify this mode as unallowed for that 
reason, even though technically the block mode can certainly 
be programmed. When a block transfer is initiated, the 
computer automatically transfers the block at the maximum DMA 
rate which varies among the various computer manufacturers. 
This transfer takes six clocks because of a wait state 
inserted by the baseband logic to accommodate slower memory 
(Eggebrecht, 1983, p. 115). The hardware designer also has 
it within his ability to insert additional wait states if 
needed to slow the process down. For a block of 4096 bytes 
and an 8 MHz machine, this could fall within the bounds of the 
extended memory refresh. If this method were to be used for 
the system 4096 byte block, the transfer could occur at an 
incredible byte rate of approximately 1.3 MHz taking a little 
over three milliseconds to transfer the entire block! 
Unfortunately, if transmission is desired at a slower rate. 
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the issue of flexibility again raises its head. Since the 
4096 byte transfer pushes the system to its practical refresh 
limits, any solution requiring additional wait states to be 
inserted extends the cycle beyond acceptable limits and will 
certainly result in parity error failures of the main memory. 

This mode was considered with a simple modification 
to the system of halving the size of the block of data. This 
approach allowed the data to be transferred at any rate 
between 650 KHz and 1.3 MHz by placing the I/O CH RDY line in 
the inactive low state while waiting for the next clock. By 
reducing the block size to half, the refresh requirements 
continued to be met. The difficulty in this approach had to 
do with the specific system being implemented which needed a 
wide number of frequency bins available in the lower frequency 
spectrum. As the frequency of byte transfer increased, the 
size of the frequency bins increased as well. This, coupled 
with the restriction that the block size be cut in half, 
drastically reduced the number of frequency bins available in 
a given low frequency response region. For example, a 
frequency response of 50 Hz to 15 KHz (such as a normal 
acoustic channel) , sending information with a 2048 bin system 
at a sampling rate of 650 KHz, has a bin size of approximately 
635 Hz and only 23 frequency bins available for assignment in 
the permitted frequency range. Contrast this with the 4096 
bin system sampled at 60 KHz which has a bin size of 
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approximately 29 Hz and 515 bins available for encoding within 
the specified frequency response. 

The DMA single byte transfer mode, even though it 
takes a few more clock cycles to perform, offered more 
flexibility in the design. The flexibility comes from the 
ability to vary the frequency over a wide range of values by 
requesting a byte transfer almost at will. Each byte is 
output after an active request has been asserted on the 
appropriate DMA request line. 
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When the choice was made to implement this design, it 
was expected that the range of allowable frequencies would be 
from 0.1 Hz to 880 KHz. This derived from the fact that, in 
this mode, the main processor is allowed an instruction cycle 
(four clock cycles) in between every DMA byte transfer (six 
clock cycles) . Thus only about half of the instruction 
bandwidth is utilized for the DMA transfer. Unfortunately, 
the background process detailed in Section II. E. 2 consisted 
of instructions much longer than the four clock memory access 
cycle. Given the longest instruction of 51 clock cycles (INT) 
and the six clock DMA cycle, the highest sampling frequency 
with a 100 per cent probability of noninterference was 140.35 
KHz. Even given the fact that the signal loses no pulses, the 
signal is not truly stable. Because of the differing lengths 
of instructions in the system idle loop, the latched signal 
is transmitted at times which fluctuate slightly around the 
DMA request. This effect is seen in Figure 6 and is not a 
significant problem. 

An incomplete summary of the events causing the 
sinusoid perturbation shown in Figure 6 is listed below with 
the timing references signifying the number of elapsed clock 
cycles : 

Cycle Event 

0 INT instruction occurs and requires 51 
clock cycles to complete. 

1 DMA requests the bus for byte transfer. 
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50 

57 

57+ 

67 

74 

107 



114 

116 

123 

171 

175 

182 

192 



DMA cycle starts. 

CMP instruction occurs and requires 9 
clock cycles to complete. 

DMA requests the bus for byte transfer #2. 
DMA cycle #2 starts. 

IRET instruction starts and requires 32 
clock cycles to complete. 

JZERO instruction starts which flushes 
the instruction pipeline but causes no 
problem since it requires enough time 
for execution to allow pipeline to refill. 
It requires 16 clocks to complete. 

DMA request #3. 

DMA cycle #3 starts. 

INT instruction starts (51 clocks) . 

DMA request #4. 

DMA cycle #4 starts. 

CMP instruction starts (9 clocks) . 

IRET instruction starts (32 clocks) . 



Note that the very worst case is the starting point 
for the process, yet, by the second data point the signal is 
stable at an optimum value within a very narrow time differen- 
tial. Since there are two instructions in the loop with 
substantially shorter instruction lengths than the troublesome 
INT instruction, there is a significant cushion available to 
the process to buffer the effects of the different execution 
lengths . 
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When viewed on the oscilloscope the output appears 
stable up to approximately 250 KHz. This is due to the fact 
that the timing of the computer is completely independent of 
the timing of the crystal oscillator for the expansion board 
allowing the instruction performance to be much more well 
behaved than the worst case cited above where the longest 
possible instruction occurs just the instant before the clock 
pulse requesting a byte transfer. Depending on the level of 
redundancy in the code, the errors introduced by operating in 
the range between 140 KHz and 250 KHz may be acceptable. 

Another interesting possibility arises when the output 
is accomplished at the lower rates. Less than two percent of 
the instruction bandwidth of the computer is utilized at 140 
KHz. This leaves most of the computer's capabilities free to 
accomplish other tasks in parallel with the message 
transmission. 

The memory refresh is easily accommodated in this 
method as well since a window is opened every byte transfer 
cycle which permits another higher priority process such as 
the refresh to take place. 

The DMA must be initiallized when its use is to be 
invoked. Initialization is accomplished by writing data to 
several registers contained within the DMA controller. In the 
personal computer design, the address space starting at 
Address 0 is decoded and sent to the DMA controller. This 
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application uses an assembly language routine disguised as a 
Pascal subroutine named DMAIMIT (for DMA initialize) . This 
subroutine, which is located in Appendix D, initializes the 
base address of the matrix, the number of points to be output, 
the byte type of memory transfer, and the channel to be 
utilized. It was written closely following a non-working 
version used as an example in (Sargent and Shoemaker, 1984, 
p. 246). 

Since the DMA controller is unable to address the 
entire address space of the computer, the DMA page register, 
a device separate from the controller, is initialized in the 
same subroutine with the source matrix address information. 
The address which decodes to this device is 80 hexadecimal. 

DMAINIT is called as the last substantive step of the 
main program immediately following the SCALEDATA subroutine 
which ensures that an appropriate output level is obtained. 
The parameter which determines if a single block is to be 
repetitively output would probably be chosen differently in 
the system once a real encoded signal is being output. This 
parameter is a part of the word output to location dma+11. 
In this case, the DMA was initialized to automatically return 
to the same data block when the terminal count of bytes 
transferred was reached so that a nonchanging signal could be 
observed on an oscilloscope. 
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The current expansion board design has a DMA request 
occurring on the appropriate channel at a frequency determined 
by the oscillator input. This repetitive request is the 
result of an asynchronous clock running to the DREQ3* line. 
(The * symbol following the signal name indicates that it is 
an active low signal — in this case the request is being made 
when the signal is a logic zero.) As long as the channel is 
masked, the request has no effect and the output circuit 
ignores anything appearing on the data bus. 

When the initialization is complete, the appropriate 
DMA channel is unmasked and the data transfer is started. 

D. DIGITAL TO ANALOG CONVERSIONS 

An octal data latch held the signal stable so that the 
conversion process from a digital to an analog signal could 
take place. 

As can be seen in Figure 7, the data is stable (shown with 
a cross-hatch) on the data bus when the lOW* signal 
transitions from the active low state to the inactive high 
state in conjunction with the DACK3* signal being in its 
active low state. Coupling the logical or with the rising 
edge sensitive flip-flop a single point is defined for the 
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data latch. This is the point that primary memory is latched 
to retain the value until the next valid data.® 

When a standard eight bit digital to analog converter (DAC 
0800) was checked to meet the system specifications, its range 
of abilities was far beyond anything that the system needed 
both in terms of frequency and accuracy. The output of the 
octal latch is continuously fed to the converter where its 



®It is necessary to include the DACK3* signal with the lOW* 
signal to prevent data from being latched whenever the lOW* signal 
becomes active due to a memory refresh. The logical or is chosen 
only because of part and signal availability. The two needed 
signals are available on the expansion bus only in the active low 
states . 
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output is provided to the back of the computer as the analog 
signal out. 

The circuit was built on a Jameco JE36 PCB Breadboard 
which could be inserted directly in the expansion slot with 
output signals sent directly to a 25 pin connector on the 
computer back. 
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IV. SUMMARY AND CONCLUSIONS 



A. FLEXIBILITY 

A working MFQPSK personal computer based system was 
developed which not only meets the design constraints of a 10 
KHz to 14 KHz frequency response output at a sampling frequen- 
cy of 61.440 KHz but also provides a large measure of flexibi- 
lity in the following ways: (1) the expansion board and 
driving programs are able to be run on any of the industry 
standard fully compatible computers using processors ranging 
from the Intel 8088 to the Intel 80386 with no software or 
hardware changes required; (2) the frequency response of the 
system can be tuned to match different hardware constraints 
imposed merely by changing the values of the constants ALOW, 
BLOW, CLOW, DLOW, ELOW, EHI in the main program of Appendix 
A which adjust the frequency bins selected by the data 
encoding process to match the frequencies permitted; and (3) 
by changing only a single clock on the expansion board, the 
output sampling frequency can be modified from 0.1 Hz to 140 
KHz on an 8.0 MHz machine (the upper frequency limit can be 
extended to 1.3 MHz in the block transfer mode with the 
following block size limitations) . 
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Frequency Range 

0-140K 



Mode 

byte 

block 

block 

block 

block 

block 



Meix Block Size 

no limit 



80K-160K 

160K-320K 

320K-650K 

650K-1.3M 

1.3M 



256 

512 

1024 

2048 

4096 



B. CHANGES REQUIRED FOR LONGER MESSAGES 

With messages that result in a signal length longer than 
one block, ENCODEDATA should build a table of pointers 
representing the allocated memory locations where the blocks 
of data are represented. ComplexFFT should be called from 
within a loop that is executed until all blocks have been 
transformed. Since the FFT routine returns data in the same 
memory locations as when called, the table of pointers used 
on entry to the loop would be the same as those needed for 
output by the SCALEDATA routine. The DMAINIT routine would 
have to be modified to accept another parameter representing 
the current pointer to the base address. As mentioned previ- 
ously, the autoinitialization option would not be chosen since 
each data structure would be transmitted only once with each 
call giving a new base address. 

C. CURRENT AND FURTHER RESEARCH 

As was mentioned earlier, current research is being 
conducted to implement a receiver for the system using a 
similar application of personal computer based principles. 
Another project is researching an error detecting/correcting 
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scheme for translating the ASCII characters into the frequency 
and phase representations. 

A project is needed to settle on a method for synchroniz- 
ing the data blocks to protect against signal deterioration 
from the effects such as the transmission medium attenuation 
and differential channel delays for different frequencies. 
One intriguing concept worthy of further consideration is that 
a short correlation operation could be conducted as a parallel 
process utilizing the wasted portion of the instruction 
bandwidth mentioned in Section III.C.4 above. Since correla- 
tion requires only addition and subtraction operations which 
have a computational cost of approximately one-thirtieth that 
of the multiplies required for the FFT butterflies, the 
process could be designed to be conducted in real time during 
the transmission of a 4096 byte block of data. If not quite 
enough time was available for the synchronizing correlation, 
the block could be repeatedly transmitted the number of times 
necessary to extend the time of transmission while providing 
a simple means of error correction to the receiving system. 

D. ENHANCEMENTS 

The current progression towards very large scale 
integrated (VLSI) circuits makes it reasonable to expect an 
integrated device capable of performing a 4096 point FFT on 
a single chip within the next couple of years. A semicustom 
design could be pursued even now with a bitsliced approach 
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from a company such as Advanced Micro Devices in Sunnyvale, 
California. With their AMD 29000 series devices, a user could 
design a special purpose data flow (meaning no program counter 
is required) microprocessor using standard library components 
such as floating point arithmetic processing units. Semicus- 
tom devices, however, are still very expensive and would not 
improve the system enough to justify the added expense. 

Another potential approach which shows promise is to 
utilize a dedicated digital signal processor such as the 
products offered by the Zoran Corporation or Analog Devices, 
Inc., as a coprocessor. The trend with these devices is 
toward the capability of performing larger and larger FFTs on 
chip. The ZR34161 processor has an on device storage capabil- 
ity of 128 complex integers and can perform an FFT of that 
size in a single instruction. The continuing difficulty is 
that for each such FFT performed, 512 memory accesses must be 
made before and after the instruction is performed to preposi- 
tion the data in the appropriate cache memory locations on the 
coprocessor. The promise held, however, is that when enough 
data can be prepositioned in the on chip cache memory, a real 
time FFT can be performed at a reasonable cost per 
application. 
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APPENDIX A 



program THESIS; 

{This program is the main program which supports the MFQPSK 
system. It is a Turbo Pascal program which runs on a MS-DOS 
system with a math coprocessor. Subroutines called by this 
stub but not included in the file are ComplexFFT which is 
contained in FFT87B2.INC file and DMAINIT which is contained 
in the DMAINIT1.BIN file. The assembly language source code 
for the DMAINIT routine is contained in Appendix D of this 
thesis. } 

{$!-) {Disables I/O error trapping) 

{$R-) {Disables range checking) 

const 

TNArraySize = 4095; 

{THIS IS THE SIZE OF A BLOCK) 
lOerr : boolean = false; 

type 

NZERARRAY = array [1..256] of 0..4; 

{FOR THE CHOICES OF ONE OF FOUR 
PHASES OR FOR A NON-CHOICE OF 
0.0 REAL AND 0.0 IMAGINARY) 

BCSTARRAY = array [0. .TNArraySize] of byte; 

{HOLDS DMA DATA TO BE BROADCAST. 

THIS REPRESENTS AN ENTIRE BLOCK 
OF FFT PROCESSED DATA WHICH IS 
OUTPUT AS THE TIME SIGNAL) 

TNvector = array [0. .TNArraySize] of Real; 

{TYPE FOR BOTH THE REAL AND 
IMAGINARY DATA FOR THE INPUT TO 
THE FFT) 

TNvector Ptr = "TNvector; 

{PTR FOR FFT DATA ARRAY WHICH 
ALLOWS DYNAMIC ALLOCATION OF 
MEMORY BY THE "NEW" CONSTRUCT. 

IF MORE ARRAYS WERE REQUIRED 
TO BE KEPT FOR MULTIPLE BLOCK 
TRANSMISSIONS, THIS DYNAMIC 
ALLOCATION COULD KEEP THE DATA 
AVAILABLE FOR MULTIPLE BLOCK 
TRANSMISSIONS. ) 



var 

ALOW , BLOW , CLOW , DLOW , ELOW , EHI , I , INDEX , J , K , BINDEX , 
NUMBAUDS,NUMPTS,PHASECHOICE,TEMP : INTEGER; 
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DATAR,DATAI 

OUTFILE 



THE FILE 



: REAL; 

: TEXT; 

{THIS FILE WILL HOLD THE ASCII 
CHARACTERS WHICH REPRESENT THE 
HEX VALUES) 

; file of byte; 

(THIS FILE HOLDS THE ACTUAL HEX 
BYTES ) 



NZERO 



BCST 

XREAL,XIMAG 



: NZERARRAY; 

(KEEPS TRACK OF ASSIGNMENTS FOR 
REVERSALS ) 

; BCSTARRAY ; 

(ARRAY MADE AVAIL TO DMAINIT) 

; TNvectorPtr ; 



(PTRS FOR DATA SENT TO FFT) 
INVERSE, RANSELECT, SHORTCHOICE : BOOLEAN; 
DATA, ERROR : BYTE; 



($I FFT87B2.INC) 

($I COMPFFT.INC) 

($I COMMON. INC) 

procedure DMAINIT (var BCST : BCSTARRAY) ; 

(THIS PROCEDURE IS REALLY AN ASSEMBLY LANGUAGE ROUTINE 
BUILT IN THE FORMAT ACCEPTABLE TO TURBO PASCAL WHICH 
PASSES THE ADDRESS BCST ON THE STACK IN THE FORM OF FOUR 
BYTES OF DATA. THE FOUR BYTES OF THE ADDRESS ARE 
COMPOSED OF THE SEGMENT (TWO BYTES) AND THE OFFSET (TWO 
BYTES) . THE PROCEDURE TAKES CARE OF ALL INITIALIZATION 
OF THE DMA CONTROLLER AND STARTS THE OUTPUT PROCESS. 

THE OUTPUT PROCESS COULD BE SEPARATED BY UNMASKING THE 
DMA CHANNEL SEPARATELY. THIS PROCEDURE CAN BE CALLED 
REPEATEDLY BY MERELY SUBSTITUTING A NEW ADDRESS FOR A 
NEW 4096 BYTE BLOCK OF DATA. ) 
external ' DMAINITl . BIN ' ; 

(THE PROCEDURE CALLED HERE IS A BINARY FILE WHICH IS 
PRODUCED BY ASSEMBLING THE DMAINITl. ASM PROGRAM, LINKING 
THE RESULTING DMAINITl. OBJ FILE WITH THE MS-DOS PROGRAM 
LINK AND THEN EXE2BIN-ING THE FILE WITH THE MS-DOS 
PROGRAM EXE2BIN USING THE DEFAULT OUTPUT EXTENSION .BIN. 
THIS DMAINIT1.BIN FILE IS PLACED IN THE SAME DIRECTORY 
AS THE TURBO PASCAL COMPILER) 

procedure GETPHASECHOICE ; 

(THIS PROCEDURE IS WRITTEN ONLY TO BE ABLE TO SEE AN 
UNDERSTANDABLE RESULT OF THE FFT PROCESS WHICH IS A 
SIMPLE SINE-WAVE IF ONLY ONE FREQUENCY IS SELECTED. 

OF COURSE THE WHOLE PROCESS GETS MORE COMPLICATED IF 
MORE CHOICES ARE MADE BEFORE THE "NO MORE FREQUENCIES" 
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OPTION IS SELECTED SINCE THE RESULT OF 256 SELECTIONS 
IS A BAND LIMITED WHITE NOISE CASE. IT GETS THE PHASE 
CHOICES FROM THE KEYBOARD OR SELECTS THE PHASES RANDOMLY 
IF RANSELECT IS TRUE.) 



var 

GOODRESPONSE : boolean; 
RESPONSE : char; 



begin (GETPHASECHOICE) 
GOODRESPONSE := TRUE 
repeat 

WRITELN( 

WRITELN; 

WRITELN( 

WRITELN ( 

WRITELN ( 

WRITELN ( 

WRITELN ( 

WRITELN ( 

WRITELN ( 



WHICH ONE YOU WANT? ' ) 



•0 
•1 
•2 
' 3 
•4 
•N 
•R 



NOT THIS 
QUADRANT 
QUADRANT 
QUADRANT 
QUADRANT 
NO MORE 



FREQUENCY, THANKS') 
ONE ' ) ; 

TWO ' ) ; 

THREE ' ) ; 

FOUR ' ) ; 

FREQS PLEASE ' ) ; 



DECIDE FOR ME, PLEASE (RANDOM)') 



READLN (RESPONSE) : 
case RESPONSE of 

'O' 



' 1 ' 
'2 ' 
' 3 ' 
'4 ' 
'N' 



4 ; 
0 ; 
3 ; 
2 ; 
1 ; 



'R' . 

else 

begin 

WRITELN ( 'YOU 
GOODRESPONSE 
end ; 

end; (case) 
until (GOODRESPONSE 
end (GETPHASECHOICE) ; 



PHASECHOICE 
PHASECHOICE 
PHASECHOICE 
PHASECHOICE 
PHASECHOICE 
n' 
begin 

SHORTCHOICE := TRUE; 
PHASECHOICE := 4; 
end; 

r' : RANSELECT := TRUE; 



HAVE TO PICK 
:= FALSE; 



ONE OFF THE LIST! ' ) 



or RANSELECT) ; 



procedure ENCODEDATA 

begin (ENCODEDATA) 

NEW(XREAL) ; 



NEW(XIMAG) ; 



(THIS DYNAMICALLY ALLOCATES MEMORY 
FOR THE ARRAY NEEDED TO INPUT 
FREQUENCY BIN ASSIGNMENTS FOR THE 
FFT ROUTINE) 
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FILLCHAR(XREAL* ,SIZEOF(XREAL‘) ,0) ; (FILLCHAR FILLS 

THE NEWLY ALLOCATED MEMORY WITH 0 
VALUES } 

FILLCHAR(XIMAG* ,SIZEOF(XIMAG‘) ,0) ; 

ERROR := 0; {THIS SETS THE ERROR RETURN CODE SO 

THAT AN ERROR INDICATED BY THE FFT 
ROUTINE CAN BE RECOGNIZED) 
for I := 0 to NUMPTS-1 do 
begin 

if (I < BLOW) then 
begin 

DATAR := 0.0; 

DATAI := 0.0; 

end 

else if (I >= BLOW) and (I < CLOW) then 
begin 

J := J+I; 

if not (RANSELECT or SHORTCHOICE) then 
GETPHASECHOICE 
else if SHORTCHOICE then 
PHASECHOICE ;= 4 



else 



PHASECHOICE : = RANDOM ( 4 ) ; 
if (PHASECHOICE = 0) then 
begin 

DATAR := 80.0; 

DATAI := 80.0; 

end 

else if (PHASECHOICE = 1) then 
begin 

DATAR := 80.0; 

DATAI := -80.0; 

end 

else if (PHASECHOICE = 2) then 
begin 

DATAR := -80.0; 

DATAI := -80.0; 

end 

else if (PHASECHOICE = 3) then 
begin 

DATAR := -80.0; 

DATAI := 80.0; 

end 

else 

begin 

DATAR := 0.0; 

DATAI : = 0.0; 
end ; 



NZERO[J] 

end 



PHASECHOICE ; 



else if (I >= CLOW) and (I <= DLOW) then 
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begin 

DATAR := 0.0; 

DATAI : = 0.0; 

end 

else if (I > DLOW) and (I <= BLOW) then 
begin 

PHASECHOICE := NZERO[J]; 
if (PHASECHOICE = 0) then 
begin 

DATAR ;= 80.0; 

DATAI ;= -80.0; 

end 

else if (PHASECHOICE = 1) then 
begin 

DATAR ;= 80.0; 

DATAI := 80.0; 

end 

else if (PHASECHOICE = 2) then 
begin 

DATAR := -80.0; 

DATAI := 80.0; 

end 

else if (PHASECHOICE = 3) then 
begin 

DATAR := -80.0; 

DATAI := -80.0; 

end 

else 

begin 

DATAR := 0.0; 

DATAI : = 0.0; 
end; 

J • — J 1 ; 

end 

else 

begin 

DATAR := 0.0; 

DATAI : = 0.0; 
end ; 

XREAL'[I] := DATAR; 

XIMAG‘[I] := DATAI; 
end ; 

end{ENCODEDATA} 

procedure { SCALEDATA } 
begin 

V7RITELN ('WRITING THE FFT RESULTS TO DISK'); 

for INDEX := 0 to NUMPTS-1 do 

begin 

{*N0TE THAT THIS REPRESENTS A BIAS IN THE SAMPLE 
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OF HALF THE DYNAMIC RANGE OF THE DIGITAL TO ANALOG 
CONVERTER. THIS BIAS SHOULD BE SUBTRACTED BEFORE 
THE SIGNAL IS TRANSMITTED. FOR EXAMPLE, IF THE 
DYNAMIC RANGE OF THE DEVICE IS O-IOV, THE BIAS IS 
+5. THIS BIAS WAS DEVELOPED PRESUMING THE D/A DEVICE 
DOES NOT RANGE BETWEEN + AND - VALUES . ) 

TEMP := ROUND (XREAL" [INDEX] + 126.0); 

if TEMP < 0 then 
TEMP := 0; 

DATA := TEMP; 

WRITE (OUTFILE,TEMP, • '); 

WRITE (THEFILE , DATA) ; 

BCST[BINDEX] := TEMP; 

BINDEX := BINDEX + 1; 
end; 

end {SCALE DATA) 
begin (MAIN PROGRAM) 

ASSIGN (OUTFILE, '0UTFILA1.DAT' ) ; (THIS OUTPUT FILE IS 
DESIGNATED SO THAT AN EXPERIMENTER CAN ENSURE THAT 
HE HAS A REASONABLE OUTPUT PRESENTED TO THE DMA 
WHICH COULD CONCEIVABLY RESULT FROM THE FFT PROCESS 
TO WHICH HE FED THE INPUT. IF HE SELECTED ONLY ONE 
FREQUENCY, HE WOULD EXPECT TO SEE IN THIS FILE THE 
ASCII REPRESENTATION OF A SINUSOID) 

REWRITE (OUTFILE) ; {REWRITING MAKES THE FILE BLANK IF IT 
ALREADY EXISTS AND OTHERWISE CREATES A FILE BY THAT 
NAME) 

ASSIGN (THEFILE, 'H0UTFA1.DAT' ) ; {THIS OUTPUT FILE IS 
DESIGNATED SO THAT A BINARY OR HEXADECIMAL OUTPUT 
(DEPENDING ON YOUR POINT OF REFERENCE) IS CREATED 
WHICH CAN BE IMMEDIATELY OUTPUT TO A D/A CONVERTER. 
IN THE EARLIEST DAYS OF THIS PROJECT, THE DATA WAS 
OUTPUT THROUGH AN INDEPENDENT PROCESS FOR TEST AND 
THIS FILE WAS CONSTRUCTED SO THAT THE DATA WAS EASILY 
TRANSLATED) 

REWRITE (THEFILE) ; (REWRITING MAKES THE FILE BLANK IF IT 
ALREADY EXISTS AND OTHERWISE CREATES A FILE BY THAT 
NAME) 

INVERSE := FALSE; (THIS MEANS THAT WE ARE GOING TO ASK 
FOR A FORWARD FFT. THE ISSUE OF WHETHER A FORWARD OR 
INVERSE FFT IS ONLY A MATTER OF A FACTOR OF 1/PI) 

J := 0; (J PROVIDES THE INDEX FOR THE ARRAY WHICH 

RECEIVES THE 256 FREQUENCY BIN PHASE ASSIGNMENTS 0, 
1, 2, OR 3 REPRESENTING QUADRANT 1, 2, 3, OR 4 , OR 4 
REPRESENTING THE CHOICE OF NO FREQUENCY ASSIGNMENT AND 
THEN RETURNS THAT VALUE IN THE REVERSE SEQUENCE FOR 
THE MIRROR FREQ ON THE OTHER SIDE OF THE NYQUIST 
CENTER FREQUENCY) 
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{THIS IS WHERE THE VALUES ARE INSERTED FOR THE VARIETY OF 
CASES. THE CATEGORIES ARE BROKEN UP AS FOLLOWS; IN THE A 
RANGE FROM ALOW TO AHI WILL ALWAYS BE ZERO REAL AND ZERO 
IMAGINARY. IN THE B RANGE FROM BLOW TO BHI ARE A RANDOM SET 
OF QUADRATURE PHASE SIGNALS. THE B RANGE IS SELECTED BY 
CORRELATING THE REQUIRED FREQUENCY RESPONSE OF THE INTENDED 
OUTPUT DEVICE. THE FREQUENCY PHASES ARE SELECTED BY USING 
A RANDOM NITMBER GENERATOR TO BUILD AN EVENLY DISTRIBUTED 
SEQUENCE OF INTEGERS WITH VALUES RANGING FROM 0 TO 3 IF 
RANSELECT IS TRUE. IF SHORT CHOICE IS TRUE THEN THE VALUE IS 
ALWAYS SELECTED AS INTEGER 4 WHICH REPRESENTS A NONCHOICE OF 
FREQUENCY. AS THESE VALUES ARE GENERATED THEY ARE STORED IN 
THE NZERO MATRIX SO THEY CAN BE USED AGAIN FOR THE REVERSE 
SEQUENCE NECESSARY TO GENERATE A REAL TRANSFORM. AFTER EACH 
VALUE IS GENERATED AND TEMPORARILY STORED IN THE NZERO MATRIX, 
IT IS ASSIGNED A QUADRATURE PHASE REPRESENTATION BASED ON THE 
INTEGER VALUE ASSIGNED. I FOUND THAT A GOOD WEIGHTING TO GIVE 
THESE QUADRATURE SIGNALS WAS 80.0 REAL AND 80.0 IMAGINARY 
(WITH THE SIGNS DEPENDENT ON THE REPRESENTATION SPECIFIED BY 
THE INTEGER VALUE) . THE C RANGE IS AGAIN ASSIGNED VALUES OF 
ZERO REAL AND ZERO IMAGINARY AND CROSSES THE CENTER OF THE 
FREQUENCY SPECTRUM BEING SENT TO THE FFT ROUTINE. THE D RANGE 
IS THE RANGE WHERE THE B VALUES ARE REVERSED IN ORDER AND 
GIVEN THE VALUES OF THE B RANGE COMPLEX CONJUGATES. THIS IS 
ACCOMPLISHED BY READING THE NZERO MATRIX IN THE REVERSE ORDER 
AND EVALUATING THE INTEGERS AS IN THE B RANGE EXCEPT THAT THE 
IMAGINARY PARTS HAVE THE SIGNS REVERSED. THE E RANGE FROM 
ELOW TO EHI IS ZEROS IN BOTH REAL AND IMAGINARY PARTS . } 



NUMPTS ;= 4096; (THE COMBINATION OF NUMPTS AND NUMBAUDS 
IS INTENDED TO ALWAYS END UP BEING A TOTAL OF 4096 
POINTS PRESENTED TO THE FFT ROUTINE. FOR EXAMPLE 
ANOTHER SYSTEM COULD BE CHOSEN SUCH THAT NUMBAUDS MIGHT 
BE 16 AND NUMPTS BE 256} 

NUMBAUDS := 1; 

ALOW ;= 0; 

BLOW := 672; 

CLOW := 928; 

DLOW := 2768; 

ELOW := 3424; 

EHI := 4095; 

BINDEX := 0; {BROADCAST INDEX) 

RANSELECT ;= FALSE; {IF TRUE SELECTS RANDOM PHASES) 
SHORTCHOICE := FALSE; {IF TRUE SELECTS LESS THAN FULL LOAD 

OF FREQUENCIES) 
for K := 1 to NUMBAUDS do 
begin 

ENCODEDATA 

WRITELN ( ' PERFORMING THE FFT ' ) ; 

ComplexFFT (NUMPTS, INVERSE, XREAL,XIMAG, ERROR) ; 
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' , ERROR) 



WRITELN ('THE ERROR VALUE IS 

SCALEDATA 

DISPOSE (XREAL) ; 

DISPOSE (XIMAG) ; 

end; 

CLOSE (OUTFILE) ; 

CLOSE (THEFILE) ; 

DMAINIT(BCST) ; 

end. 
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APPENDIX B 



Assembly Language Output Loop Program 

example was compiled with the DEBUG program 
provided with all MS-DOS systems 



XXXX: 0100 


mov 


dx, 2000 


XXXX: 0103 


mov 


bx, 0112 


XXXX: 0106 


mov 


cx, 2000 


XXXX: 0109 


mov 


al , [bx] 



XXXX: OlOB 


out 


dx, al 


XXXX:010C 


inc 


bx 


XXXX: OlOD 


dec 


cx 


XXXX:010E 


jnz 


0109 


XXXX: 0110 


int 


20 


XXXX: 0112 







;placed 



;puts the port address in 
;dx register 

;puts address of data area 
;in bx register 
; specifies 4096 words of 
;data to be output 
;take word of data at address 
/pointed to by bx register 
/and put it in al register 
/puts byte of data in al on 
/the data bus 

/add one to the buffer address 
/decrease count remaining 
/loop until all bytes are out 
/returns system control to DOS 
/this is where the data is 
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APPENDIX C 



Assembly Language Program to Modify Refresh Time 
example was compiled with the DEBUG program 



/• provided 

9 


with 


all MS-DOS 


XXXX:0100 


mov 


al/74 


XXXX:0102 


out 


43, al 


XXXX: 0104 


mov 


al, ff 


XXXX:0106 


out 


41, al 


XXXX: 0108 


mov 


al, 00 


XXXX:010A 


out 


41, al 


XXXX:010C 


int 


20 



systems 



/•selects timer #1 

;43 is the port for the timer 

/•this is the value you will set 

/•for the lower 8 bit count down 

;41 is the port for the count 

/•preload 

/•this is the upper 8 bits for the 
/•count down 

/•the timer keeps track of how 
;many times it has been sent a 
/•value and knows that this is the 
; upper 8 bits since it is the 
; second time its address has been 
; decoded 

/•returns system control to DOS 
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APPENDIX D 



codeseg segment 

public dmainit 
assume cs; codeseg 

/procedure DMAINIT (var BCST ; BCSTARRAY) ; 

9 

/this procedure initializes dma channel 3 and sets the 
/parameters to output the array best by passing the start 
/address of the array on the stack. 





dma 


equ 


0 




dmapage 


equ 


8 Oh 




dwavent 


equ 


lOOOh 


dmainit 


proc 


near 






push 


bp 






mov 


bp,sp ; 


ruse bp to address stack 




les 


di,dword ptr[bp+4] /move address of 








best into es:di 




mov 


al , 5bh 


dma chan 3 single mode. 




out 


dma+11 , al 


read, autoinitialize 




out 


dma+12 , al 


reset first/last ff 




mov 


ax, es 


calc high order 4 bits 




mov 


cl , 4 


of buffer area 




rol 


ax, cl 






push 


ax ; 


isave ax for dma start addr 




and 


al, Ofh 






out 


dmapage+2,al /store in ch 3 dma page 








r register 




pop 


ax 






and 


al , Of Oh 






add 


ax,di i 


rget page offset 




out 


dma+6,al ; 


r output waveform buffer 




mov 


al,ah ; 


r start address 




out 


dma+6 , al 






mov 


ax, dwavent ; 


r output dma byte count 




out 


dma+7 , al 






mov 


al , ah 






out 


dma+7 , al 






mov 


al , 3 i 


r unmask ch 3 to start 




out 


dma+10 , al 






pop 


bp 






ret 


4 i 

i 


/pop 4 bytes off stack for 
/addr of best 


dmainit 


endp 






codeseg 


ends 








end dmainit 
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APPENDIX E 



The following two figures show the circuit used in the 
expansion board based MFQPSK system. The first is the logical 
schematic and the second is the actual expansion board pin 
diagram. 
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